The Open University’s repository of research publications and other research outputs
Aggregating Research Papers from Publishers’ Systems to Support Text and Data Mining: Deliberate Lack of Interoperability or Not?
Abstract
In the current technology-dominated world, interoperability of systems managed by different organisations is an essential property that enables services to be provided at a global scale. In the field of Text and Data Mining (TDM), interoperability of systems offering access to text corpora can increase the uptake and impact of TDM applications. The global corpus of all research papers, a collection of human knowledge so large that no one could read it in a lifetime, represents one of the most exciting opportunities for TDM. The Open Access movement, which advocates free availability of research papers and the right to reuse them for TDM, has achieved some major successes on the legal front, but the technical interoperability of systems offering free access to research papers remains a challenge. COnnecting REpositories (CORE) (Knoth and Zdrahal, 2012) aggregates the world’s open access full-text scientific manuscripts from repositories, journals and publisher systems. One of the main goals of CORE is to harmonise and pre-process these data to lower the barrier to TDM. In this paper, we report the preliminary results of an interoperability survey of systems provided by journal publishers, both open access and toll access. The survey helps us assess the current level of interoperability of these systems and suggest ways forward.
Similar resources
Cross-Platform Text Mining and Natural Language Processing Interoperability PROCEEDINGS
In the current technology dominated world, interoperability of systems managed by different organisations is an essential property enabling the provision of services at a global scale. In the Text and Data Mining field (TDM), interoperability of systems offering access to text corpora offers the opportunity of increasing the uptake and impact of TDM applications. The global corpus of all resear...
Can text structure be incompatible
How do some concepts vanish over time?
This paper presents the current stage of my PhD research, which focuses on using machine learning to support human learning from examples. I present an approach to determining how some concepts change their contexts over time, using two techniques suitable for indexing and data mining: latent semantic indexing (LSI) and the APRIORI algorithm.
Understanding research dynamics
Rexplore leverages novel solutions in data mining, semantic technologies and visual analytics, and provides an innovative environment for exploring and making sense of scholarly data. Rexplore allows users: 1) to detect and make sense of important trends in research; 2) to identify a variety of interesting relations between researchers, beyond the standard co-authorship relations provided by mo...
Publication date: 2016